For the past 13 seasons, the American sports entertainment reality show American Ninja Warrior has been taking the world by storm. The show brings in a wide variety of contestants and challenges them to complete obstacle courses across several rounds and stages. A contestant who completes a round moves on to the next one. We are planning on going on American Ninja Warrior and want to get a leg up on the competition. Is there a way to predict the obstacles we will see on the courses? Are certain types of obstacles more likely in certain rounds? Let's find out.
Throughout this tutorial, we will look for trends among the obstacles on American Ninja Warrior, in the hope of finding the best way to predict which obstacles we might see in future seasons and how often they appear.
For this tutorial, you will need the following libraries:
1. pandas
2. matplotlib.pyplot
3. seaborn
4. sklearn
5. sklearn.linear_model
6. statsmodels.formula.api
Here we import all of the necessary libraries and load the dataset. Go to https://data.world/ninja/anw-obstacle-history and download the dataset from there. We can then read the spreadsheet into a pandas DataFrame using the code below.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
from sklearn import linear_model
from sklearn import datasets
from sklearn import metrics
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
from statsmodels.formula.api import ols
data = pd.read_excel('American Ninja Warrior Obstacle History.xlsx')
data.head()
| | Season | Location | Round/Stage | Obstacle Name | Obstacle Order |
|---|---|---|---|---|---|
| 0 | 1 | Venice | Qualifying | Quintuple Steps | 1 |
| 1 | 1 | Venice | Qualifying | Rope Swing | 2 |
| 2 | 1 | Venice | Qualifying | Rolling Barrel | 3 |
| 3 | 1 | Venice | Qualifying | Jumping Spider | 4 |
| 4 | 1 | Venice | Qualifying | Pipe Slider | 5 |
Here we have a dataset that we were able to get from awesome-public-datasets on GitHub. Above are the first five rows of the dataset. The dataset contains all of the obstacles that appeared in the first 10 seasons of American Ninja Warrior. Each row gives the name of the obstacle, the season it appeared in, the location of the round, the round/stage, and the obstacle's position in the course order. To begin, we want to check whether there is anything worth analyzing at all. If every obstacle appears only once, there is no point in trying to predict which obstacles we might see.
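Before plotting anything, the question of whether obstacles repeat can be answered with a quick count. The snippet below is a minimal sketch: the helper name `repeat_summary` is our own, and the `sample` frame is a made-up stand-in so the example runs without the spreadsheet. With the real data you would pass `data` instead.

```python
import pandas as pd

def repeat_summary(df, column="Obstacle Name"):
    """Return (number of distinct obstacles, number seen more than once)."""
    counts = df[column].value_counts()
    return len(counts), int((counts > 1).sum())

# Made-up stand-in rows for illustration only, NOT the real dataset.
sample = pd.DataFrame({
    "Obstacle Name": ["Quintuple Steps", "Rope Swing", "Quintuple Steps",
                      "Pipe Slider", "Rope Swing", "Quintuple Steps"],
})

distinct, repeated = repeat_summary(sample)
print(f"{repeated} of {distinct} obstacles appear more than once")
```

If the second number comes back as zero on the real data, there is nothing to predict and the rest of the analysis can stop here.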
#creating new dataframe to add obstacles to
obstacles = pd.DataFrame(columns=['obstacles','count'])
#grouping by obstacle (a plain string key keeps each group name a string rather than a tuple)
groups = data.groupby('Obstacle Name')
#looping through each obstacle group and adding its name and number of appearances to the new dataframe
for name, group in groups:
    #only label obstacles that appear more than 10 times so the chart stays readable
    if len(group)>10:
        obstacles.loc[len(obstacles.index)] = [name, len(group)]
    else:
        obstacles.loc[len(obstacles.index)] = ["", len(group)]
#setting the figure size before plotting so it applies to this chart
plt.rcParams["figure.figsize"]=(200,200)
plt.pie(obstacles['count'],labels = obstacles['obstacles'],textprops={'fontsize': 100})
plt.show()
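The same counts and labels can also be built without an explicit loop. This is a sketch of one compact alternative using `value_counts`, not a replacement for the approach above. The helper name `pie_inputs` and the `sample` rows are hypothetical, and the threshold is lowered to 3 only so the tiny example has something to label.

```python
import pandas as pd

def pie_inputs(df, column="Obstacle Name", threshold=10):
    """Return per-obstacle counts and labels, blanking obstacles at or below the threshold."""
    counts = df[column].value_counts()
    labels = [name if n > threshold else "" for name, n in counts.items()]
    return counts, labels

# Made-up stand-in rows for illustration only, NOT the real dataset.
sample = pd.DataFrame({
    "Obstacle Name": ["Warped Wall"] * 4 + ["Rope Swing"] * 2 + ["Pipe Slider"],
})

counts, labels = pie_inputs(sample, threshold=3)
print(labels)
```

With the real data, `counts` and `labels` could be passed to `plt.pie(counts, labels=labels)`. Note that `value_counts` orders obstacles from most to least frequent, whereas the `groupby` loop orders them alphabetically, so the slices would appear in a different order.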